Solving the online batching problem using deep reinforcement learning
Abstract
In e-commerce markets, on-time delivery is of great importance to customer satisfaction. In this paper, we present a Deep Reinforcement Learning (DRL) approach for deciding how and when orders should be batched and picked in a warehouse to minimize the number of tardy orders. In particular, the technique facilitates making decisions on whether an order should be picked individually (pick-by-order) or picked in a batch with other orders (pick-by-batch), and if so, with which other orders. We approach the problem by formulating it as a semi-Markov decision process and develop a vector-based state representation that includes the characteristics of the system. This allows us to create a deep reinforcement learning solution that learns a strategy by interacting with the environment, and we solve the problem with a proximal policy optimization algorithm. We evaluate the performance of the proposed DRL approach by comparing it with several batching and sequencing heuristics in different problem settings. The results show that the DRL approach is able to learn a strategy that produces consistent, good solutions and performs better than the heuristics.
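The abstract names proximal policy optimization but gives no implementation details. As a rough illustration only, the standard clipped surrogate objective that this family of algorithms maximizes can be sketched as follows. The function name, the argument names, and the default clipping value of 0.2 are assumptions made for this sketch and are not taken from the paper.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Average clipped surrogate objective (a quantity to maximize).

    Hypothetical helper: names and the clip_eps default are illustrative
    assumptions, not the paper's implementation.
    """
    if not (len(new_log_probs) == len(old_log_probs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not advantages:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the new policy far from the policy that collected the data: once the ratio leaves the clipping interval in the direction favored by the advantage, the objective stops improving.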
Similar resources
Deep Reinforcement Learning for Solving the Vehicle Routing Problem
We present an end-to-end framework for solving the Vehicle Routing Problem (VRP) using deep reinforcement learning. In this approach, we train a single model that finds near-optimal solutions for problem instances sampled from a given distribution, only by observing the reward signals and following feasibility rules. Our model represents a parameterized stochastic policy, and by applying a policy g...
Problem solving with reinforcement learning
This thesis is concerned with practical issues surrounding the application of reinforcement learning techniques to tasks that take place in high dimensional continuous state-space environments. In particular, the extension of on-line updating methods is considered, where the term implies systems that learn as each experience arrives, rather than storing the experiences for use in a separate oo-...
The algorithm for solving the inverse numerical range problem
The numerical range of a square matrix A is denoted by W(A) and defined as W(A) = {x*Ax : x ∈ S1}, where S1 is the unit sphere. In 2009, Russell Carden posed the inverse numerical range problem as follows: for a point z ∈ W(A), find a vector x ∈ S1 such that z = x*Ax. In this thesis, we present an algorithm for solving the inverse numerical range problem.
A Reinforcement Learning Model for Solving the Folding Problem
In this paper we propose a reinforcement learning based model for solving combinatorial optimization problems. Combinatorial optimization problems are hard to solve optimally, which is why any attempt to improve their solutions is beneficial. We focus in particular on the bidimensional protein folding problem, a well-known NP-hard optimization problem important within many fields i...
Solving a New 3D Bin Packing Problem with Deep Reinforcement Learning Method
In this paper, a new type of 3D bin packing problem (BPP) is proposed, in which a number of cuboid-shaped items must be put into a bin one by one orthogonally. The objective is to find a way to place these items that can minimize the surface area of the bin. This problem is based on the fact that there is no fixed-sized bin in many real business scenarios and the cost of a bin is proportional to...
Journal
Journal title: Computers & Industrial Engineering
Year: 2021
ISSN: 0360-8352, 1879-0550
DOI: https://doi.org/10.1016/j.cie.2021.107221